- (c) Copyright 1989-1999 Amiga, Inc. All rights reserved.
- The information contained herein is subject to change without notice, and
- is provided "as is" without warranty of any kind, either express or implied.
- The entire risk as to the use of this information is assumed by the user.
-
-
-
- Troubleshooting Your Software
-
- by Carolyn Scheppner
-
-
- Many Amiga programming errors have classic symptoms. This guide
- gives some tips to help you find and eliminate these problems in
- your software.
-
-
- CLI Error Messages
-
- This is caused by calling exit(n) with an invalid or missing
- return value. Assembler programmers using startup code should
- jump to the startup code's _exit with a valid return value on
- the stack. Programs without startup code should return with a
- valid value in D0. Valid return values are defined in
- libraries/dos.h and libraries/dos.i. Other values (-1 for
- instance) can cause CLI error messages such as "not an object
- module".
-
-
- CLI Won't Close on RUN
-
- A CLI can't close if a program has a lock on the CLI input or
- output stream ("*"). If your program is RUN >NIL: from a CLI,
- that CLI should be able to close unless your code or your
- compiler's startup code explicitly opens "*".
-
-
- Crashes and Memory Corruption at Run Time
-
- Memory corruption, address errors, and illegal instruction errors
- are generally caused by wild pointers - pointers which are
- uninitialized, incorrectly initialized, or point to memory
- or resources which have already been freed or closed. You may be
- accidentally modifying or incrementing a pointer later used to free
- memory or close a resource. The pointer may be one that you use
- directly, or indirectly, such as in a structure which is passed
- to a system routine.
-
- Amiga functions which open system resources or allocate memory
- typically return a pointer; if the open or allocate fails, zero
- is usually returned. So you must test the return value of a
- system call for success before using it as a pointer.
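
The check described above can be sketched in portable C. This is only an illustration: standard malloc() stands in for an Amiga allocation call, the function names make_buffer and use_buffer are hypothetical, and the failure code 20 is assumed to mirror RETURN_FAIL from libraries/dos.h.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate and clear a buffer, or report failure.
   malloc() stands in here for a system allocation call. */
char *make_buffer(size_t size)
{
    char *buf = malloc(size);

    if (buf == NULL) {
        /* Allocation failed: never touch the pointer, just report it. */
        return NULL;
    }
    memset(buf, 0, size);   /* safe only because buf was tested first */
    return buf;
}

/* Returns 0 on success, 20 on failure (20 is assumed to match
   RETURN_FAIL; treat it as a placeholder in this sketch). */
int use_buffer(size_t size)
{
    char *buf = make_buffer(size);

    if (buf == NULL)
        return 20;
    buf[0] = 'A';
    free(buf);
    return 0;
}
```

The same shape applies to any call that returns zero on failure: test first, and take the error exit before the pointer is ever dereferenced.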
-
- Test utilities such as MemWatch and MemMung can help catch the
- use of uninitialized pointers or freed memory. MemMung is a
- torture test which sets freed memory areas and location $0 to
- odd values (if your program is written correctly, this will have
- no effect; if not, your program is more likely to fail).
- MemWatch is a watchdog utility which reports modification of low
- memory (wild pointers often point to low memory, especially $0).
-
- Memory corruption and crashes can also be caused by calling
- functions with the wrong arguments or missing arguments (for
- example, SetAPen(3) or SetAPen(win,3) instead of SetAPen(rp,3)).
- Another possibility is that you might be overflowing your
- stack. The compiler's stack checking option may be able to catch
- this (with Lattice, -v disables stack checking). Cut stack
- usage by dynamically allocating large structures, buffers, and
- arrays if they are currently defined inside main() or your other
- functions. If you are using short integers be sure to explicitly
- type any long constants (e.g. 42L). For example, with short
- integers, the expression 1 << 17 may become zero. If corruption
- is occurring during exit, use printf (or kprintf, etc.) with
- Delay(n) to slow down your cleanup and broadcast each step.
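
The short-integer pitfall mentioned above can be reproduced on any machine by modeling a 16-bit int with a fixed-width type. This is a sketch, assuming a compiler that provides <stdint.h>; the function names are hypothetical.

```c
#include <stdint.h>

/* Models what 1 << 17 leaves behind when the arithmetic is done in a
   16-bit int: bit 17 does not fit, so only the low 16 bits survive. */
uint16_t shift_as_short(unsigned count)
{
    uint32_t full = (uint32_t)1 << count;
    return (uint16_t)(full & 0xFFFFu);
}

/* Typing the constant as long (1L) keeps the arithmetic 32 bits wide. */
long shift_as_long(unsigned count)
{
    return 1L << count;
}
```

With a count of 17 the first function yields zero while the second yields 131072, which is the difference the explicit L suffix makes.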
-
- A bad pointer which causes a system crash will often be reported
- as guru meditation 00000003 or 00000004. Numbers in the range
- 00000006 through 0000000B may also indicate a problem with
- pointers. These numbers correspond to the hardware-defined CPU
- exceptions of Motorola's 68000 family of processors. Generally
- these occur when the CPU tries to access a non-existent memory
- location or execute an illegal instruction. Other guru meditation
- numbers are Amiga-specific, but may also be caused by wild
- pointers; for the meaning of these codes, refer to the include
- file exec/alerts.h.
-
-
- Crashes - After Exit
-
- If your program crashes after exiting only when your program is
- started from the Workbench, then you are probably UnLock()ing one
- of the WBStartup message wa_Locks, or UnLock()ing the Lock
- returned from an initial CurrentDir() call. If you call
- CurrentDir() in your application, you should save the first lock
- it returns and then call CurrentDir() on that lock before you
- exit.
-
- If you are crashing from both Workbench and CLI and you are only
- crashing after exit, then you may be freeing or closing something
- twice. Also, you may be freeing or closing something that you
- did not allocate or open.
-
- A crash after your program exits can also be caused by leaving an
- outstanding device IO request or other wakeup request. If you
- send an IO request and then exit, Exec, upon completion of that
- IO request, will send a reply message to a port that no longer
- exists. You must abort and then WaitIO() on any pending IO
- requests before you free things and exit. See the autodocs for
- your device and for Exec AbortIO() and WaitIO(). Similar
- problems can be caused by deleting a subtask that is in a wait
- loop such as WaitTOF(). Only delete subtasks when you are sure
- they are in a safe state such as Wait(0L).
-
-
- Crashes - Subtasks, Interrupts
-
- If part of your code runs on a different stack or on the system
- stack, you must turn off compiler stack-checking options
- (Lattice uses the -v flag to disable stack checking). If part
- of your code is called directly by the system or by other tasks,
- you must use the large code / large data model, or use compiler
- functions or options to assure that the correct base registers
- are set up for your subtask or interrupt code.
-
-
- Crashes - Window Related
-
- Be careful not to call CloseWindow() during a
- while(msg=GetMsg(...)) loop on that window's port because the
- next GetMsg() will be on a freed pointer. Also, use
- ModifyIDCMP(NULL) with care, especially if you are using one port
- with multiple windows. Be sure to ClearMenuStrip() any menus
- before closing a window, and do not free items such as
- dynamically allocated gadgets and menus while they are attached
- to a window.
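
One way to avoid closing the window inside the message loop is to record the close request and act on it only after the port has been drained. The sketch below models this in portable C; the Msg and Port types and the get_msg() function are hypothetical stand-ins for Exec messages, a message port, and GetMsg(), not the real API.

```c
#include <stddef.h>

enum { MSG_OTHER, MSG_CLOSE };

struct Msg  { int kind; struct Msg *next; };
struct Port { struct Msg *head; };

/* Stand-in for GetMsg(): unlink and return the first message, or NULL. */
struct Msg *get_msg(struct Port *port)
{
    struct Msg *msg = port->head;

    if (msg != NULL)
        port->head = msg->next;
    return msg;
}

/* Drain the port completely. Returns 1 if a close was requested.
   The caller closes the window only after this returns, so the loop
   never calls get_msg() on a port that has already been freed. */
int drain_port(struct Port *port, int *handled)
{
    struct Msg *msg;
    int close_requested = 0;

    while ((msg = get_msg(port)) != NULL) {
        if (msg->kind == MSG_CLOSE)
            close_requested = 1;   /* remember it, do not close yet */
        (*handled)++;
        /* a real program would ReplyMsg() the message here */
    }
    return close_requested;
}
```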
-
-
- Crashes - Workbench Only
-
- If you are crashing near the first DOS call, either your stack is
- too small or your startup code does not GetMsg() the WBStartup
- message from the process message port. If your program crashes
- only when started from Workbench and your startup code opens no
- stdio window or NIL: file handles for Workbench programs, then
- make sure you are not writing anything to stdout (e.g. with
- printf()) when started from Workbench (argc==0). See also
- ``Crashes - After Exit'' above.
-
-
- Disk Icon Won't Go Away
-
- This occurs when a program leaves a lock on one or more of a
- disk's files or directories.
-
-
- Fails Only On the 68020/30
-
- In general this occurs whenever an application inadvertently
- contains a CPU dependency. The following programming practices
- will lead to programs which fail on the Motorola 68020/030 (but
- run OK on the 68000):
-
- o Using the upper byte of addresses for flags.
- o Doing signed math on addresses.
- o Writing self-modifying code.
- o Using the MOVE SR assembler instruction (use Exec GetCC()
- instead).
- o Using software delay or timing loops.
- o Making assumptions about the order in which asynchronous
- tasks will finish.
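
To see why flags kept in the upper byte of an address break on the 68020/30, the difference between a 24-bit and a 32-bit address bus can be modeled with plain integers. This is a sketch only; the function names and sample values are made up for illustration.

```c
#include <stdint.h>

/* Store a flag byte in the top 8 bits of a 32-bit address value. */
uint32_t tag_address(uint32_t addr, uint8_t flags)
{
    return (addr & 0x00FFFFFFu) | ((uint32_t)flags << 24);
}

/* A 68000 drives only 24 address lines, so the top byte is ignored. */
uint32_t seen_by_24bit_bus(uint32_t value)
{
    return value & 0x00FFFFFFu;
}

/* A CPU with a full 32-bit address bus uses every bit of the value. */
uint32_t seen_by_32bit_bus(uint32_t value)
{
    return value;
}
```

A tagged pointer still reaches the intended location through the 24-bit model, but through the 32-bit model it refers to a different address altogether.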
-
- Special features of the 68020/30 processors can also cause
- problems for programs written on the 68000. For example, an
- invalid cache entry due to DMA or other non-processor
- modification of data which has already been cached; a different
- exception stack frame; interrupt auto-vectors moved by VBR; the
- 68020/30 CLR instruction which does a single write access unlike
- the 68000 CLR instruction which does a separate read and write
- access (this might affect a read-triggered register in IO space -
- use MOVE instead).
-
-
- Fails Only On the 68000
-
- Again, a program which fails only on certain processors contains
- a CPU dependency. The following programming practices can cause
- this problem:
-
- o Software delay loops.
- o Word or longword access of an odd address (illegal on the
- 68000).
- o Assumptions about the order in which asynchronous tasks will
- finish.
- o Using compiler flags which have generated inline 68881/68882
- math coprocessor instructions or 68020/30 specific code.
- o Using the CLR instruction on a hardware register (its
- behavior on the 68000 differs from the 68020/30; use MOVE
- instead).
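
Odd-address word access often comes from reading a multi-byte field out of a packed byte buffer through a cast pointer. A sketch of a byte-wise alternative follows; the function name is hypothetical, and big-endian byte order is assumed because that is what the 68000 family uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian value starting at any offset, odd or even.
   Only byte accesses are made, so no word access ever lands on an
   odd address. */
uint16_t read_word_be(const uint8_t *buf, size_t offset)
{
    return (uint16_t)(((uint16_t)buf[offset] << 8) | buf[offset + 1]);
}
```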
-
-
- Fails Only on Older ROMs or Older Workbench
-
- This can be caused by calling functions or using structures which
- do not exist in the older versions of the operating system. Or
- you may be asking for a library version higher than you need.
- Ask for the lowest version which provides the functions your
- application requires (usually 33). You should not use the #define
- LIBRARY_VERSION from the include files when you open a library.
- Also make sure you check OpenLibrary() calls for success (a
- non-zero return value). If the library you request is not
- available, exit gracefully and informatively.
-
-
- Fails Only on Newer ROMs or Newer Workbench
-
- This should not happen with proper programming. Possible causes
- are:
-
- o Running too close to your stack limits or the memory limits
- of a base machine (newer versions of the operating system
- may use slightly more stack in system calls, and usually use
- more free memory).
- o Using system functions improperly.
- o Not testing function return values.
- o Using improperly initialized pointers.
- o Assuming that a system variable (such as a Flags field) is B
- if it is not A.
- o Failing to initialize formerly reserved structure fields to
- zero.
- o Violating Amiga programming guidelines (for example:
- depending on or poking private system structures, jumping
- into ROM, depending on undocumented or unsupported
- behaviors).
- o Failing to read the function autodocs.
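
Two of the points above, testing flag bits explicitly and zeroing reserved fields, can be sketched as follows. The structure, flag names, and values are invented for this example and do not correspond to any real system structure.

```c
#include <string.h>

#define MODE_A 0x0001u   /* hypothetical flag bits */
#define MODE_B 0x0002u

struct Request {
    unsigned short flags;
    unsigned short reserved[3];   /* may gain a meaning in a later release */
};

/* Clear the whole structure first so reserved fields start out as zero. */
void init_request(struct Request *req, unsigned short flags)
{
    memset(req, 0, sizeof *req);
    req->flags = flags;
}

/* Returns 'A', 'B', or '?'. Each mode is tested for explicitly; a value
   that is neither A nor B is treated as unknown, not assumed to be B. */
char classify(const struct Request *req)
{
    if (req->flags & MODE_A)
        return 'A';
    if (req->flags & MODE_B)
        return 'B';
    return '?';
}
```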
-
-
- Fails On CHIP-RAM-Only Machines
-
- This is caused by specifically asking for or requiring MEMF_FAST
- memory. If you don't need chip memory, ask for memory type 0L,
- or MEMF_CLEAR, or MEMF_PUBLIC|MEMF_CLEAR as applicable. If
- there is fast memory available, you will be given fast memory.
- If not, you will get chip memory.
-
-
- Fails Only on Machines with FAST RAM
-
- Data and buffers which will be accessed directly by the custom
- chips must be in chip memory. This includes bitplanes (use
- OpenScreen() or AllocRaster() ), audio samples, trackdisk
- buffers, and the graphic image data for sprites, pointers, bobs,
- images, gadgets, etc. Use compiler or linker flags to force chip
- memory loading of any initialized data that needs to be in chip
- memory. You could also dynamically allocate chip memory and copy
- the initialized data there.
-
-
- Fails Only with Enhanced Chips
-
- This is usually caused by writing or reading addresses past the
- end of register space on older custom chips, or writing a
- non-zero value to bits which are undefined in older chip
- registers, or failing to mask out undefined bits when
- interpreting the value read from a chip register.
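
Masking a value read from a chip register might look like the sketch below. The mask value and names are hypothetical; the real set of defined bits has to come from the hardware documentation for the register in question.

```c
#include <stdint.h>

/* Hypothetical: only the low 4 bits of this register are defined. */
#define REG_DEFINED_BITS 0x000Fu

/* Keep the documented bits and discard everything else, so that bits
   which gain a meaning on newer chips cannot change the result. */
uint16_t read_defined_bits(uint16_t raw)
{
    return (uint16_t)(raw & REG_DEFINED_BITS);
}
```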
-
-
- Fireworks
-
- A dazzling pyrotechnic video display is caused by trashing or
- freeing a copper list which is in use, or trashing the pointers
- to the copper list. If you aren't messing with copper lists, see
- ``Crashes and Memory Corruption''.
-
-
- Graphics - Corrupted Images
-
- The bit data for graphic images such as sprites, pointers, bobs,
- and gadgets must be in chip memory. Check your compiler manual
- for directives or flags which will place your graphic image data
- in chip memory. Alternatively, you could allocate chip memory and
- copy the graphic image there.
-
-
- Hang - Single Program Only
-
- Program hangs are generally caused by Wait()ing on the wrong
- signal bits, on the wrong port, on the wrong message, or on some
- other event that will never occur. They can also be caused by
- verify deadlocks. Be sure to turn off all Intuition VERIFY
- messages (such as MENUVERIFY) before calling AutoRequest() or
- doing disk access.
-
-
- Hang - Whole System
-
- This is generally caused by a Disable() without a corresponding
- Enable(). It can also be caused by memory corruption, especially
- corruption of low memory. See ``Crashes and Memory Corruption''
- above.
-
-
- Memory Loss
-
- First, make sure that your program is actually causing the memory
- loss. Boot with a normal Workbench disk whose
- s:startup-sequence LoadWB command line has been changed to
- LoadWB -debug. It is important to boot with a standard
- Workbench because some third party applications such as
- background utilities, shells, and network handlers dynamically
- allocate and free memory. Arrange all windows so that part of
- the Workbench backdrop window is accessible and so that no
- window rearrangement will be needed to run your program. Select
- flushlibs from the rightmost Workbench menu. Any disk-loaded
- fonts, libraries, devices, etc. that are not currently open will
- be flushed from memory. Wait a few seconds, then click on the
- Workbench backdrop. Write down the amount of free memory
- displayed in the Workbench title bar. Now without rearranging
- any windows, run your program and use all of the program
- features. Exit your program, wait a few seconds, then click on
- the Workbench backdrop. Now select flushlibs, wait a few
- seconds and write down this final free amount. If this matches
- the first value you wrote down, then your program is fine and is
- not causing a memory loss.
-
- If memory was actually lost and your program can be run from CLI
- or Workbench, then try the above procedure with both methods of
- starting your program. See ``Memory Loss - CLI Only'' and
- ``Memory Loss - Workbench Only'' as appropriate.
-
- If you lose memory from both Workbench and CLI, then make sure
- all calls to functions which open/allocate/create/lock have a
- matching call to the corresponding close/free/delete/unlock
- function (there are a few system calls that do not require a
- corresponding free - check the autodocs). Generally, the
- close/free/delete/unlock calls should be in the opposite order of
- the allocations.
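
Releasing resources in the reverse order of acquisition can be sketched in portable C. Standard malloc() and free() stand in for the Amiga open/allocate and close/free pairs; the function name and the log array are hypothetical and exist only to make the order visible.

```c
#include <stdlib.h>

/* Acquire two resources, then release them in the opposite order.
   Each step is recorded in log[] ('a'/'b' = acquired, 'B'/'A' = released).
   Returns the number of log entries written; log needs room for 4. */
int run_with_cleanup(char *log)
{
    int n = 0;
    char *first = NULL;
    char *second = NULL;

    first = malloc(32);
    if (first == NULL)
        goto cleanup;
    log[n++] = 'a';

    second = malloc(32);
    if (second == NULL)
        goto cleanup;
    log[n++] = 'b';

    /* ... work with both resources here ... */

cleanup:
    /* Last acquired, first released. Only release what was obtained. */
    if (second != NULL) {
        free(second);
        log[n++] = 'B';
    }
    if (first != NULL) {
        free(first);
        log[n++] = 'A';
    }
    return n;
}
```

Routing every exit through one cleanup section also makes it harder to free something twice or to skip a release on an error path.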
-
- If you are losing a small, fixed amount of memory, look for a
- structure of that size in the Structure Offsets listing in the
- Includes and Autodocs manual. For example, a loss of exactly 24
- bytes is probably a Lock which has not been UnLock()ed. If you
- are using ScrollRaster(), be aware that ScrollRaster() left or
- right in a SUPERBITMAP window with no TmpRas will currently lose
- memory (workaround - attach a TmpRas). If you lose much more
- memory when started from Workbench than from the CLI, make sure
- your program is not using Exit(n). This would bypass startup
- code cleanups and prevent a Workbench-loaded program from being
- unloaded. Use exit(n) instead.
-
-
- Memory Loss - CLI Only
-
- Some third-party shells dynamically allocate history buffers, or
- cause other memory fluctuations. Also, if your program executes
- different code when started from CLI, check that code and its
- cleanup. And check your startup.asm if you wrote your own.
-
-
- Memory Loss - Ctrl-C Exit Only
-
- This occurs when you have Amiga-specific resources allocated and
- you have not disabled your compiler's automatic Ctrl-C handling
- (causing all of your program clean-ups to be skipped). Disable
- the compiler's Ctrl-C handling and handle Ctrl-C yourself.
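
Handling the break yourself usually means recording it and leaving through the normal cleanup path. The sketch below uses the standard C signal() facility as a stand-in; how to disable a particular compiler's built-in Ctrl-C handling is compiler-specific and not shown, and the function names are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t break_seen = 0;

/* The handler only records the break; cleanup happens in normal code. */
static void on_break(int signum)
{
    (void)signum;
    break_seen = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_break_handler(void)
{
    return (signal(SIGINT, on_break) == SIG_ERR) ? -1 : 0;
}

/* Main loops poll this and fall through to their cleanup when it is set. */
int break_requested(void)
{
    return break_seen != 0;
}
```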
-
-
- Memory Loss - During Execution
-
- A continuing memory loss during execution can be caused by
- failure to keep up with all of the IDCMP messages (such as
- MOUSEMOVE) that you request from Intuition. Intuition cannot
- reuse IDCMP message blocks until you call ReplyMsg() on them.
- If your window's allotted message blocks are all in use, new ones
- will be allocated and not freed until the window is closed.
- Continuing memory losses can also be caused by a program loop
- containing an allocation/open type call without a corresponding
- free.
-
-
- Memory Loss - Workbench Only
-
- This is often caused by the failure of your code to unload after
- you exit. Make sure that your code is being linked with a
- correct, standard startup module and do not use the Exit(n)
- function to exit your program. The Exit(n) function will bypass
- your startup code's cleanup, including its ReplyMsg() of the
- WorkbenchStartup message (this signals Workbench to unload your
- program from memory). You should exit via exit(n) where n is a
- valid DOS error code such as RETURN_OK (libraries/dos.h). You
- may also exit with a final closing brace "}" or with the return
- statement. Assembler programmers using startup code can JMP to
- _exit with a long return value on the stack or use the RTS
- instruction.
-
-
- Menu Problems
-
- A flickering menu is caused by leaving a pixel or more space
- between menu subitems in your menu structures. Crashing after
- browsing a menu (looking at a menu without selecting any items) is
- caused by not properly handling MENUNULL select messages.
- Multiple selection not working is caused by improper handling of
- the NextSelect field. See the Menus chapter from the
- Intuition manual for more details.
-
-
- Out-of-Sync Response to Input
-
- This is caused by failing to handle all received signals or
- messages after waking up from a Wait() or WaitPort() call. More
- than one event or message may have caused your program to be
- awakened. Check the signals returned by Wait() and act on every
- one that is set. At ports which may have more than one message
- (such as a window's IDCMP port) you must handle the messages in a
- while(msg=GetMsg(...)) loop.
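
Acting on every signal that is set, and not just the first one found, can be sketched like this. The signal bit values and the counters are hypothetical; on the Amiga the mask would come from Wait() and the bits from your ports' signal numbers.

```c
#define SIG_WINDOW 0x01ul   /* hypothetical signal bits */
#define SIG_TIMER  0x02ul
#define SIG_BREAK  0x04ul

struct Counts { int window; int timer; int brk; };

/* Check each bit independently (if, not else-if), because one wakeup
   may carry several signals at once. */
void dispatch_signals(unsigned long mask, struct Counts *c)
{
    if (mask & SIG_WINDOW)
        c->window++;   /* here: drain the window port in a GetMsg() loop */
    if (mask & SIG_TIMER)
        c->timer++;
    if (mask & SIG_BREAK)
        c->brk++;
}
```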
-
-
- Performance Loss in Other Processes
-
- This is often caused by a program doing one of the following:
-
- o Busy waiting or polling.
- o Running at a higher priority.
- o Doing lengthy Forbid()s, Disable()s, or interrupt handling.
-
-
- Sound Samples Won't Play Correctly
-
- The data for audio samples must be in chip memory. Check your
- compiler manual for directives or flags which will place your
- audio sample data in chip memory. Also, you can dynamically
- allocate chip memory and copy or load the audio sample there.
-
-
- Trackdisk Data Not Transferred
-
- This may occur if your trackdisk buffers are not in chip memory.
-
-
- Windows - Borders Flicker after Resize
-
- Set the NOCAREREFRESH flag. Even SMART_REFRESH windows can
- generate refresh events if there is a sizing gadget, so if you
- don't have specific code to handle this, you must set the
- NOCAREREFRESH flag. If you do have refresh code, be sure to use
- the Begin/EndRefresh() calls. Failure to do one or the other
- will leave Intuition in an intermediate state and slow down
- operation for all windows on the screen.
-
-
-
- GENERAL DEBUGGING TECHNIQUES
-
- Isolate the problem by using printf() to find the section of code
- in which the problem occurs. If you cannot display messages on
- the screen, use kprintf() to send messages to the serial port or
- dprintf() for the parallel port (see Linker Library
- documentation). Check the initial values, allocation, use, and
- freeing of all pointers and structures used in the problem area.
- Also make sure that all of your system and internal function
- calls pass correct initialized arguments and that all possible
- error returns are checked for and handled.
-
-
-
- Use Debugging Tools
-
- A variety of debugging tools are available to help locate faulty
- code. There are source level debuggers (such as Lattice's
- CodeProbe), crash interceptors (such as GOMF), memory watchdogs
- like MemWatch and WatchMem, and other helpful tools like MemMung,
- Avail, WBFrags, etc.
-
-
- Test With Different Configurations
-
- Test your program on a wide variety of systems and
- configurations. Programs with coding errors may appear to work
- properly on one configuration but may fail or cause fatal
- problems on another. Make sure that your code is tested on both
- the 68000 and the 68020/30, on machines with and without fast
- memory, and on machines with and without enhanced chips. Test
- all of your program functions on every machine.
-
-
- Test All Error and Abort Code
-
- A program with missing error checks or unsafe cleanup might work
- fine when all of the items it opens or allocates are available,
- but may fail fatally when an error or problem is encountered.
- Try your code with missing files, filenames with spaces,
- incorrect filenames, cancelled requesters, Ctrl-C, missing
- libraries or devices, low memory, missing hardware, etc.
-
- Test all of your text input functions with international ASCII
- characters (such as the character produced by pressing ALT-F then
- A). Rawkey codes produce different keyboard characters on the
- various national keyboards (higher levels of keyboard input are
- automatically translated to the proper characters). If your
- program will be distributed internationally, support and take
- advantage of the additional screen lines available on a PAL
- system. On A2000s with the enhanced Agnus chip, a PAL display
- can be selected via motherboard jumper J102. Note that a base
- PAL machine will have less memory free due to the larger display
- size.
-